139 research outputs found
Deep neural network improves the estimation of polygenic risk scores for breast cancer
Polygenic risk scores (PRS) estimate the genetic risk of an individual for a
complex disease based on many genetic variants across the whole genome. In this
study, we compared a series of computational models for estimation of breast
cancer PRS. A deep neural network (DNN) was found to outperform alternative
machine learning techniques and established statistical algorithms, including
BLUP, BayesA and LDpred. In the test cohort with 50% prevalence, the Area Under
the receiver operating characteristic Curve (AUC) was 67.4% for DNN, 64.2% for
BLUP, 64.5% for BayesA, and 62.4% for LDpred. BLUP, BayesA, and LDpred all
generated PRS that followed a normal distribution in the case population.
However, the PRS generated by DNN in the case population followed a bi-modal
distribution composed of two normal distributions with distinctly different
means. This suggests that DNN was able to separate the case population into a
high-genetic-risk case sub-population with an average PRS significantly higher
than the control population and a normal-genetic-risk case sub-population with
an average PRS similar to the control population. This allowed DNN to achieve
18.8% recall at 90% precision in the test cohort with 50% prevalence, which can
be extrapolated to 65.4% recall at 20% precision in a general population with
12% prevalence. Interpretation of the DNN model identified salient variants
that were assigned insignificant p-values by association studies, but were
important for DNN prediction. These variants may be associated with the
phenotype through non-linear relationships.
Comment: 28 pages, 7 figures, 2 tables
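The recall and precision figures in the abstract above depend on prevalence, so a short worked calculation may help. The sketch below is not the authors' method. It assumes only the standard definitions of recall, precision and false-positive rate, and the function name `precision_at_prevalence` is made up for illustration. It holds the decision threshold fixed, so recall and false-positive rate are unchanged, and re-expresses precision at a different prevalence. The abstract's extrapolated operating point (65.4% recall at 20% precision) corresponds to a different, more permissive threshold and cannot be reproduced from the numbers given here.

```python
def precision_at_prevalence(recall, precision_ref, prevalence_ref, prevalence_new):
    """Re-express precision at a new prevalence for a fixed threshold.

    Assumes recall and false-positive rate are properties of the
    classifier at that threshold and do not change with prevalence.
    """
    # True and false positives per unit of the reference cohort.
    tp_ref = recall * prevalence_ref
    fp_ref = tp_ref * (1.0 - precision_ref) / precision_ref
    # False-positive rate among controls in the reference cohort.
    fpr = fp_ref / (1.0 - prevalence_ref)
    # Same recall and FPR applied to the new case/control mix.
    tp_new = recall * prevalence_new
    fp_new = fpr * (1.0 - prevalence_new)
    return tp_new / (tp_new + fp_new)


# Figures from the abstract: 18.8% recall at 90% precision, 50% prevalence.
p_general = precision_at_prevalence(0.188, 0.90, 0.50, 0.12)
print(f"precision at 12% prevalence, same threshold: {p_general:.3f}")
```

Under these assumptions, the same threshold that gives 90% precision in the balanced cohort gives roughly 55% precision at 12% prevalence, which illustrates why operating points must be restated for the general population.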
Sphagnum physiology in the context of changing climate: emergent influences of genomics, modelling and host-microbiome interactions on understanding ecosystem function.
Peatlands harbour more than one-third of terrestrial carbon, leading to the argument that the bryophytes, as major components of peatland ecosystems, store more organic carbon in soils than any other collective plant taxa. Plants of the genus Sphagnum are important components of peatland ecosystems and are potentially vulnerable to changing climatic conditions. However, the response of Sphagnum to rising temperatures, elevated CO2 and shifts in local hydrology has yet to be fully characterized. In this review, we examine Sphagnum biology and ecology and explore the role of this group of keystone species and its associated microbiome in carbon and nitrogen cycling using literature review and model simulations. Several issues are highlighted, including the consequences of a variable environment on plant-microbiome interactions, uncertainty associated with CO2 diffusion resistances and the relationship between fixed N and that partitioned to the photosynthetic apparatus. We note that the Sphagnum fallax genome is currently being sequenced and outline potential applications of population-level genomics and corresponding plant photosynthesis and microbial metabolic modelling techniques. We highlight Sphagnum as a model organism to explore ecosystem response to a changing climate and to define the role that Sphagnum can play at the intersection of physiology, genetics and functional genomics.
Recent Advances in the Transcriptional Regulation of Secondary Cell Wall Biosynthesis in the Woody Plants
Plant cell walls provide structural support for growth and serve as a barrier against pathogen attack. Plant cell walls are also a source of renewable biomass for conversion to biofuels and bioproducts. Understanding plant cell wall biosynthesis and its regulation is of critical importance for the genetic modification of plant feedstocks for cost-effective biofuels and bioproducts conversion and production. Great progress has been made in identifying enzymes involved in plant cell wall biosynthesis, and in Arabidopsis it is generally recognized that the regulation of genes encoding these enzymes is under a transcriptional regulatory network with coherent feedforward and feedback loops. However, less is known about the transcriptional regulation of plant secondary cell wall (SCW) biosynthesis in woody species despite its high relevance to biofuels and bioproducts conversion and production. In this article, we synthesize recent progress on the transcriptional regulation of SCW biosynthesis in Arabidopsis and contrast it with what is known in woody species. Furthermore, we evaluate progress in related emerging regulatory machineries targeting transcription factors in this complex regulatory network of SCW biosynthesis.
A Variable Polyglutamine Repeat Affects Subcellular Localization and Regulatory Activity of a Populus ANGUSTIFOLIA Protein.
Polyglutamine (polyQ) stretches have been reported to occur in proteins across many organisms including animals, fungi and plants. Expansion of these repeats has attracted much attention due to their associations with numerous human diseases including Huntington's and other neurological maladies. This suggests that the relative length of polyQ stretches is an important modulator of their function. Here, we report the identification of a Populus C-terminus binding protein (CtBP) ANGUSTIFOLIA (PtAN1) which contains a polyQ stretch whose functional relevance had not been established. Analysis of 917 resequenced Populus trichocarpa genotypes revealed three allelic variants at this locus encoding 11-, 13- and 15-glutamine residues. Transient expression assays using Populus leaf mesophyll protoplasts revealed that the 11Q variant exhibited strong nuclear localization whereas the 15Q variant was only found in the cytosol, with the 13Q variant exhibiting localization in both subcellular compartments. We assessed functional implications by evaluating expression changes of putative PtAN1 targets in response to overexpression of the three allelic variants and observed allele-specific differences in expression levels of putative targets. Our results provide evidence that variation in polyQ length modulates PtAN1 function by altering subcellular localization.
Pleiotropic and Epistatic Network-Based Discovery: Integrated Networks for Target Gene Discovery
Biological organisms are complex systems that are composed of functional networks of interacting molecules and macro-molecules. Complex phenotypes are the result of orchestrated, hierarchical, heterogeneous collections of expressed genomic variants. However, the effects of these variants are the result of historic selective pressure and current environmental and epigenetic signals, and, as such, their co-occurrence can be seen as genome-wide correlations in a number of different manners. Biomass recalcitrance (i.e., the resistance of plants to degradation or deconstruction, which ultimately enables access to a plant’s sugars) is a complex polygenic phenotype of high importance to biofuels initiatives. This study makes use of data derived from the re-sequenced genomes from over 800 different Populus trichocarpa genotypes in combination with metabolomic and pyMBMS data across this population, as well as co-expression and co-methylation networks in order to better understand the molecular interactions involved in recalcitrance, and identify target genes involved in lignin biosynthesis/degradation. A Lines Of Evidence (LOE) scoring system is developed to integrate the information in the different layers and quantify the number of lines of evidence linking genes to target functions. This new scoring system was applied to quantify the lines of evidence linking genes to lignin-related genes and phenotypes across the network layers, and allowed for the generation of new hypotheses surrounding potential new candidate genes involved in lignin biosynthesis in P. trichocarpa, including various AGAMOUS-LIKE genes. 
The resulting Genome Wide Association Study networks, integrated with Single Nucleotide Polymorphism (SNP) correlation, co-methylation, and co-expression networks through the LOE scores, are proving to be a powerful approach to determine the pleiotropic and epistatic relationships underlying cellular functions and, as such, the molecular basis for complex phenotypes, such as recalcitrance.
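The Lines Of Evidence idea described in the abstract above can be illustrated with a toy calculation. This is a simplified sketch, not the scoring system from the study. It assumes each network layer is a set of undirected edges and that a gene's score is the number of layers in which it is directly linked to any target gene or phenotype. The function name `loe_scores`, the layer names, and the gene and target identifiers are all hypothetical.

```python
from collections import defaultdict


def loe_scores(layers, targets):
    """Count, per gene, how many layers link it directly to a target.

    layers:  dict mapping layer name -> set of (node_a, node_b) edges
    targets: set of target genes or phenotypes (assumed known up front)
    """
    scores = defaultdict(int)
    for edges in layers.values():
        linked = set()
        for a, b in edges:
            # Treat edges as undirected: either end may be the target.
            if b in targets and a not in targets:
                linked.add(a)
            if a in targets and b not in targets:
                linked.add(b)
        # Each layer contributes at most one line of evidence per gene.
        for gene in linked:
            scores[gene] += 1
    return dict(scores)


# Hypothetical toy data, for illustration only.
layers = {
    "gwas": {("geneA", "lignin_phenotype"), ("geneB", "lignin_phenotype")},
    "coexpression": {("geneA", "lignin_gene1")},
    "comethylation": {("lignin_gene1", "geneA"), ("geneC", "geneB")},
    "snp_correlation": {("geneB", "lignin_gene1")},
}
targets = {"lignin_phenotype", "lignin_gene1"}

scores = loe_scores(layers, targets)
print(scores)
```

In this toy input, a gene supported by several layers ranks above one supported by a single layer, which is the intuition behind integrating the layers. The published scoring may weight or combine layers differently.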
Regulation of Lignin Biosynthesis and Its Role in Growth-Defense Tradeoffs
Plant growth-defense tradeoffs are fundamental for optimizing plant performance and fitness in a changing biotic/abiotic environment. This process is thought to involve readjusting resource allocation to different pathways. It has been frequently observed that among secondary cell wall components, alteration in lignin biosynthesis results in changes in both growth and defense. How this process is regulated, leading to growth or defense, remains largely elusive. In this article, we review the canonical lignin biosynthesis pathway, the recently discovered tyrosine shortcut pathway, and the biosynthesis of unconventional C-lignin. We summarize the current model of the hierarchical transcriptional regulation of lignin biosynthesis. Moreover, the interface between recently identified transcription factors and the hierarchical model is also discussed. We propose the existence of a transcriptional co-regulation mechanism coordinating energy allowance among growth, defense and lignin biosynthesis.
Genome-wide analysis of lectin receptor-like kinases in Populus
Transcript level of the C-type PtLecRLK gene in 24 different datasets from the Populus Gene Atlas Study. RNA-seq data were collected from the Populus Gene Atlas Study in Phytozome v11.0 (http://phytozome.jgi.doe.gov/pz/portal.html). The transcript level was expressed as FPKM. The sheet labeled “whole_set” contains the original FPKM values from Gene Atlas. The data for four different tissues under standard conditions are sorted in the data sheet labeled “standard”. (XLSX 10 kb)